We will analyze the seasons of the following comedy series: The Big Bang Theory, Brooklyn Nine-Nine, Modern Family, and Two And a Half Men.

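library(tidyverse)
library(here)
library(plotly)

# Read the IMDB episode data; columns are numeric unless listed otherwise
dados = read_csv(here("data/series_from_imdb.csv"),
                 progress = FALSE,
                 col_types = cols(.default = col_double(),
                                  series_name = col_character(),
                                  episode = col_character(),
                                  url = col_character(),
                                  season = col_character()))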

1 Which of the series you chose is best rated on IMDB? Is the difference large? Small?

# Keep only the four chosen series and make season numeric
dados = dados %>%
    filter(series_name %in% c("Two And a Half Men", "The Big Bang Theory",
                              "Brooklyn Nine-Nine", "Modern Family")) %>%
    mutate(season = as.numeric(season))
# Mean user rating per series and season, rounded to two decimals
sumarios = dados %>%
    group_by(series_name, season) %>%
    summarise(season_rating = round(mean(user_rating), 2))
# Interactive line chart of the mean rating per season for each series
ggplotly(
    sumarios %>%
        ggplot(aes(x = season, y = season_rating, color = series_name, group = series_name)) +
        geom_line() +
        geom_point() +
        scale_x_continuous(breaks = 1:12) +
        labs(x = "Season", y = "Mean rating", color = "Series name"),
    tooltip = c("season_rating"))
We recommend that you use the dev version of ggplot2 with `ggplotly()`
Install it with: `devtools::install_github('hadley/ggplot2')`
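
The per-season chart does not by itself say which series is best rated overall or how large the gap is. One way to put a number on it is sketched below. The helper `rating_by_series` and the toy rows are hypothetical and are not part of the original notebook; the helper only assumes a data frame with the `series_name` and `user_rating` columns used above.

```r
library(dplyr)

# Hypothetical helper: mean and median rating per series, best rated first.
rating_by_series <- function(df) {
  df %>%
    group_by(series_name) %>%
    summarise(
      mean_rating = mean(user_rating),
      median_rating = median(user_rating),
      n_episodes = n()
    ) %>%
    arrange(desc(mean_rating))
}

# Made-up toy rows, for illustration only (not values from the IMDB file).
toy <- data.frame(
  series_name = c("A", "A", "B", "B"),
  user_rating = c(8.0, 8.4, 7.0, 7.6)
)

ranking <- rating_by_series(toy)
# Gap between the best and the worst rated series, in rating points
gap <- max(ranking$mean_rating) - min(ranking$mean_rating)
```

Calling `rating_by_series(dados)` after the filter above would give one row per chosen series, and the gap between the first and last rows indicates whether the difference is large or small.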

2.1. Looking at the plot generated in question 2, do the most loved episodes receive more votes?

To answer this question, I will select the same episodes that received more than 10 thousand votes and analyze which of them received the most ratings of 10.

newData = dados %>%
    filter(user_votes > 8000) %>%
    mutate(
        rank_odiados = row_number(r1),  # higher = more hated
        rank_amados = row_number(r10))  # higher = more loved
# Bubble chart of r1 against r10; marker color and size follow the vote count
plot_ly(newData, x = newData$r1, y = newData$r10,
        text = paste("Number of votes:", newData$user_votes),
        mode = "markers", color = newData$user_votes, size = newData$user_votes)
No trace type specified:
  Based on info supplied, a 'scatter' trace seems appropriate.
  Read more about this trace type -> https://plot.ly/r/reference/#scatter

Using a bubble chart and looking at the same episodes with more than 10 thousand votes from question 2, we can see a tendency for the episodes with the most votes to also be the episodes with the most ratings of 10.
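
The tendency read off the chart can also be checked numerically. The sketch below uses a Spearman rank correlation, which is my choice for illustration and not something the original analysis computes. The helper `votes_r10_correlation` and the toy rows are hypothetical; the helper only assumes the `user_votes` and `r10` columns used in the plot.

```r
# Hypothetical helper: Spearman rank correlation between the number of votes
# and r10. Values near 1 mean that episodes with more votes also tend to
# have higher r10.
votes_r10_correlation <- function(df) {
  cor(df$user_votes, df$r10, method = "spearman")
}

# Made-up toy rows, for illustration only (not values from the IMDB file).
toy <- data.frame(
  user_votes = c(8500, 9000, 12000, 15000),
  r10 = c(0.10, 0.12, 0.20, 0.25)
)

rho <- votes_r10_correlation(toy)
```

On the real data the call would be `votes_r10_correlation(newData)`. A rank correlation is used here because vote counts are usually skewed, so a few very popular episodes would otherwise dominate the result.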
